Last Batch Policy changes for file source reader #182

Merged
merged 613 commits into from
Sep 11, 2024


swetha097
Contributor

@swetha097 swetha097 commented Jul 17, 2024

  • Modify how sharding is handled.
  • Handle the various last batch policies, along with pad_last_batch, stick_to_shard and shard_size.
  • Introduce a new vector that stores the file_names of the complete dataset, rather than only the current shard's data, to support stick_to_shard.
  • Pass the last batch policy and pad_last_batch_repeated to the loaders instead of to the context creation.
  • Introduce a struct ShardingInfo to hold the sharding-related variables: last_batch_policy, pad_last_batch_repeated, stick_to_shard and shard_size.
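
For readers unfamiliar with these options, the following is a rough, self-contained sketch of how the pieces listed above could fit together. It is not the code from this PR. The struct name ShardingInfo and its four field names come from the description; everything else (the LastBatchPolicy enumerators, the field types and defaults, the helpers shard_range and epoch_files, the contiguous shard split, and the exact meaning given to each flag) is a hypothetical assumption made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical policy names, chosen for this sketch only.
enum class LastBatchPolicy { FILL, DROP, PARTIAL };

// Field names follow the PR description; types and defaults are assumptions.
struct ShardingInfo {
    LastBatchPolicy last_batch_policy = LastBatchPolicy::FILL;
    bool pad_last_batch_repeated = false;
    bool stick_to_shard = true;
    std::int32_t shard_size = -1;  // assumed: values <= 0 mean "use the whole shard"
};

// Half-open [begin, end) slice of the full file list owned by one shard.
struct ShardRange {
    std::size_t begin;
    std::size_t end;
};

// Assumed contiguous split of the dataset across shards.
ShardRange shard_range(std::size_t dataset_size, std::size_t shard_id,
                       std::size_t shard_count) {
    return {dataset_size * shard_id / shard_count,
            dataset_size * (shard_id + 1) / shard_count};
}

// Builds the list of files one reader would consume in a given epoch.
// Keeping the file names of the complete dataset (all_files) lets a reader
// move to another shard when stick_to_shard is false.
std::vector<std::string> epoch_files(const std::vector<std::string>& all_files,
                                     std::size_t shard_id, std::size_t shard_count,
                                     std::size_t batch_size, std::size_t epoch,
                                     const ShardingInfo& info) {
    if (all_files.empty() || shard_count == 0 || batch_size == 0) return {};

    // Assumption: without stick_to_shard the reader rotates to the next shard each epoch.
    const std::size_t effective =
        info.stick_to_shard ? shard_id % shard_count : (shard_id + epoch) % shard_count;
    const ShardRange r = shard_range(all_files.size(), effective, shard_count);
    std::vector<std::string> out(all_files.begin() + r.begin, all_files.begin() + r.end);

    // Assumption: a positive shard_size caps the number of samples per epoch.
    if (info.shard_size > 0 && out.size() > static_cast<std::size_t>(info.shard_size)) {
        out.resize(static_cast<std::size_t>(info.shard_size));
    }

    const std::size_t n = out.size();
    const std::size_t remainder = n % batch_size;
    if (n == 0 || remainder == 0) return out;  // last batch is already full

    switch (info.last_batch_policy) {
        case LastBatchPolicy::DROP:
            out.resize(n - remainder);  // discard the incomplete last batch
            break;
        case LastBatchPolicy::PARTIAL:
            break;  // keep the short last batch as it is
        case LastBatchPolicy::FILL: {
            const std::size_t pad = batch_size - remainder;
            out.reserve(n + pad);  // avoid reallocation while copying from out
            if (info.pad_last_batch_repeated) {
                const std::string last = out.back();
                out.insert(out.end(), pad, last);  // repeat the last sample
            } else {
                for (std::size_t i = 0; i < pad; ++i) {
                    out.push_back(out[i % n]);  // wrap around to the shard's start
                }
            }
            break;
        }
    }
    return out;
}
```

With ten files, two shards and a batch size of 3, shard 0 owns five files in this sketch: DROP leaves three, PARTIAL leaves five, and FILL pads to six, either by repeating the last file or by wrapping to the first one. Bundling the four settings into one struct means the loaders can take a single argument instead of four separate parameters.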

swetha097 and others added 30 commits April 8, 2024 06:22
@swetha097
Contributor Author

@rrawther Resolved all the PR comments and tested with the Audio and Classification trainings.

@kiritigowda kiritigowda merged commit 87348ad into ROCm:develop Sep 11, 2024
4 of 6 checks passed
Labels
enhancement New feature or request
9 participants